A Deep Spatial Contextual Long-term Recurrent Convolutional Network for Saliency Detection

نویسندگان

  • Nian Liu
  • Junwei Han
چکیده

—Traditional saliency models usually adopt hand-crafted image features and human-designed mechanisms to calculate local or global contrast. In this paper, we propose a novel computational saliency model, i.e., deep spatial contextual long-term recurrent convolutional network (DSCLRCN) to predict where people looks in natural scenes. DSCLRCN first automatically learns saliency related local features on each image location in parallel. Then, in contrast with most other deep network based saliency models which infer saliency in local contexts, DSCLRCN can mimic the cortical lateral inhibition mechanisms in human visual system to incorporate global contexts to assess the saliency of each image location by leveraging the deep spatial long short-term memory (DSLSTM) model. Moreover, we also integrate scene context modulation in DSLSTM for saliency inference, leading to a novel deep spatial contextual LSTM (DSCLSTM) model. The whole network can be trained end-to-end and works efficiently when testing. Experimental results on two benchmark datasets show that DSCLRCN can achieve state-of-the-art performance on saliency detection. Furthermore, the proposed DSCLSTM model can significantly boost the saliency detection performance by incorporating both global spatial interconnections and scene context modulation, which may uncover novel inspirations for studies on them in computational saliency models.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Deeply Supervised 3D Recurrent FCN for Salient Object Detection in Videos

This paper presents a novel end-to-end 3D fully convolutional network for salient object detection in videos. The proposed network uses 3D filters in the spatiotemporal domain to directly learn both spatial and temporal information to have 3D deep features, and transfers the 3D deep features to pixel-level saliency prediction, outputting saliency voxels. In our network, we combine the refinemen...

متن کامل

Speech Emotion Recognition Using Scalogram Based Deep Structure

Speech Emotion Recognition (SER) is an important part of speech-based Human-Computer Interface (HCI) applications. Previous SER methods rely on the extraction of features and training an appropriate classifier. However, most of those features can be affected by emotionally irrelevant factors such as gender, speaking styles and environment. Here, an SER method has been proposed based on a concat...

متن کامل

Agile Amulet: Real-Time Salient Object Detection with Contextual Attention

This paper proposes an Agile Aggregating Multi-Level feaTure framework (Agile Amulet) for salient object detection. The Agile Amulet builds on previous works to predict saliency maps using multi-level convolutional features. Compared to previous works, Agile Amulet employs some key innovations to improve training and testing speed while also increase prediction accuracy. More specifically, we f...

متن کامل

Improving Deep Pancreas Segmentation in CT and MRI Images via Recurrent Neural Contextual Learning and Direct Loss Function

Deep neural networks have demonstrated very promising performance on accurate segmentation of challenging organs (e.g., pancreas) in abdominal CT and MRI scans. The current deep learning approaches conduct pancreas segmentation by processing sequences of 2D image slices independently through deep, dense per-pixel masking for each image, without explicitly enforcing spatial consistency constrain...

متن کامل

An efficient method for cloud detection based on the feature-level fusion of Landsat-8 OLI spectral bands in deep convolutional neural network

Cloud segmentation is a critical pre-processing step for any multi-spectral satellite image application. In particular, disaster-related applications e.g., flood monitoring or rapid damage mapping, which are highly time and data-critical, require methods that produce accurate cloud masks in a short time while being able to adapt to large variations in the target domain (induced by atmospheric c...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/1610.01708  شماره 

صفحات  -

تاریخ انتشار 2016